Negative sampling
A common method for estimating the conditional probability of a word given its context (the language-model probability) in word2vec and similar methods.
It is actually a biased estimator of that probability; compare Noise Contrastive Estimation (NCE), proposed in Gutmann2010noise. Because of this bias, it does not recover the original conditional probability. Instead it estimates a different, still reasonable, conditional probability that takes the frequency of the target word into account.
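As a rough illustration, the objective usually associated with negative sampling in word2vec can be sketched as follows. For one (center, context) pair and k sampled negative words, the loss is -log sigmoid(u_context · v_center) - sum over negatives of log sigmoid(-u_neg · v_center). The negatives are drawn from a noise distribution built from word counts, which is one place where word frequency enters the estimate. Everything in this sketch is an assumption made for illustration: the function names, the toy vocabulary, the random vectors, and the 0.75 exponent (the value commonly used with word2vec). It is not taken from this note or from any particular implementation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def noise_distribution(counts, power=0.75):
    """Normalised word counts raised to `power` (0.75 is an assumed, conventional value)."""
    weights = [c ** power for c in counts]
    total = sum(weights)
    return [w / total for w in weights]


def negative_sampling_loss(center_vec, context_vec, negative_vecs):
    """Loss for one observed (center, context) pair and its sampled negatives."""
    # Push the observed pair towards a high score.
    positive_term = -math.log(sigmoid(dot(context_vec, center_vec)))
    # Push each sampled noise word towards a low score.
    negative_term = -sum(
        math.log(sigmoid(-dot(neg, center_vec))) for neg in negative_vecs
    )
    return positive_term + negative_term


# Hypothetical toy setup: 10 words, 4-dimensional vectors, 5 negatives per pair.
random.seed(0)
vocab_size, dim, k = 10, 4, 5
counts = [random.randint(1, 100) for _ in range(vocab_size)]
input_vecs = [[random.gauss(0, 0.1) for _ in range(dim)] for _ in range(vocab_size)]
output_vecs = [[random.gauss(0, 0.1) for _ in range(dim)] for _ in range(vocab_size)]

center, context = 2, 7
# Frequent words are sampled as negatives more often than rare ones.
# This sketch does not filter out negatives that happen to equal the context word.
negative_ids = random.choices(
    range(vocab_size), weights=noise_distribution(counts), k=k
)
loss = negative_sampling_loss(
    input_vecs[center],
    output_vecs[context],
    [output_vecs[i] for i in negative_ids],
)
print(loss)
```

Unlike a full softmax, each update here touches only k + 1 output vectors instead of the whole vocabulary. The price is the bias described above: the model is trained to separate observed pairs from noise, not to produce normalised probabilities.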